Abstract

Contemporary linguistic theories (in particular, 
HPSG) are declarative in nature: they specify 
constraints on permissible structures, not how 
such structures are to be computed. Gram- 
mars designed under such theories are, there- 
fore, suitable for both parsing and generation. 
However, practical implementations of such the- 
ories don't usually support bidirectional pro- 
cessing of grammars. We present a grammar 
development system that includes a compiler of 
grammars (for parsing and generation) to ab- 
stract machine instructions, and an interpreter 
for the abstract machine language. The genera- 
tion compiler inverts input grammars (designed 
for parsing) to a form more suitable for genera- 
tion. The compiled grammars are then executed 
by the interpreter using one control strategy, re- 
gardless of whether the grammar is the original 
or the inverted version. We thus obtain a uni- 
fied, efficient platform for developing reversible 
grammars. 

1 Introduction 

The popularity of contemporary linguistic for- 
malisms such as Lexical Functional Grammar 
(Kaplan & Bresnan 82), Categorial Grammar
(Haddock et al. 87) or Head-Driven Phrase- 
Structure Grammar (HPSG) (Pollard & Sag 94), 
and especially their mathematical and formal ma- 
turity, have led to the development of various 
frameworks, applying different methods, for their 
implementation.

This paper focuses on a computational frame- 
work in which HPSG grammars can be devel- 
oped. A wide spectrum of implementation tech- 
niques for HPSG exist: one extreme is direct in- 
terpretation of grammars. For parsing, this in- 
volves a program that accepts as input a gram- 

* In R. Mitkov, N. Nicolov and N. Nikolov, eds., Pro- 
ceedings of "Recent Advances in Natural Language Process- 
ing" (RANLP'97), pp. 135-142, Tzigov Chark, Bulgaria, 
11-13 September 1997 

† Supported by the Minerva Stipendien Komitee.

* Supported by a grant from the Israeli Ministry of Sci- 
ence: "Programming Languages Induced Computational 
Linguistics" and the Fund for the Promotion of Research 
in the Technion. We thank an anonymous reviewer for 
useful comments. 



mar and a string and parses the string according 
to the grammar. For generation, the input is a 
semantic formula from which a phrase is generated.
The earliest HPSG parsers (e.g., (Proudian & Pollard 85;
Franz 90)) were designed in this way. A slightly more
elaborate technique is the use of some high-level,
unification-based logic programming language (e.g., Prolog
or LIFE (Aït-Kaci & Podelski 93)) for specifying the grammar.
Further along this line lies compilation of gram- 
mars directly into Prolog, using Prolog's internal 
mechanisms for performing unification. This is 
the implementation technique of, e.g., ProFIT (Erbach 94).
Systems such as ALE (Carpenter 92a; Carpenter & Penn 95)
also compile grammars into Prolog. However, ALE compiles
grammar descriptions directly into Prolog code, rather than into (a
Prolog representation of) feature structures. At 
run time, ALE executes the code that was com- 
piled for the rules. Parts of the unifications (re- 
sulting from type-unification) are performed at 
compile-time to increase the efficiency of the gen- 
erated code. 

In this paper we advocate a further step along
the same spectrum. We propose AMALIA, an abstract
machine specifically designed for executing
ALE grammars (without relational extensions).
AMALIA includes a compiler of input grammars
into the abstract machine language and an interpreter
for the abstract instructions. This implementation
technique has proved useful for many
programming languages, most notably Prolog itself,¹
and as we show below, it improves the efficiency
of parsing with ALE grammars considerably.
We emphasize in this paper the more practical
aspects of the system, focusing on the integration
of parsing and generation, as its theoretical
infrastructure has been presented elsewhere
(Wintner & Francez 95; Wintner 97).
From the point of view of grammar engineering,
the abstract machine approach has an additional
advantage. AMALIA's compiler incorporates
an algorithm, based on (Samuelsson 95), for
inverting grammars (designed for parsing) into a
form more suitable for generation. The compiler
then produces code for the inverted grammar, using
exactly the same machine language. Thus,
the same grammar can be compiled to two different
object programs for the two different tasks.
The interpreter executes both kinds of programs
in the same way; only the initialization of the
machine's state and the format of the final results
differ. We thus obtain a uniform platform
for developing grammars serving both for parsing
and for generation.

¹ Recently, such techniques were used for implementing
the new programming language Java.

We discuss the use of abstract machine tech- 
niques for compilation in the next section, and 
sketch the algorithm that inverts a grammar for 
generation in Section 3. Section 4 explains the 
dual operation of the abstract machine, and Sec- 
tion 5 lists some implementation details. 

2 Why abstract machines? 

High-level programming languages with dynamic 
structures have always been hard to develop com- 
pilers for. A common technique for overcoming 
the problems involves the notion of an abstract 
machine. It is a machine that, on one hand, 
captures the essentials of the high-level language 
in its architecture and instruction set, such that 
compiling from the source language to the (ab- 
stract) machine language becomes relatively sim- 
ple. On the other hand, the architecture must 
be simple enough for the abstract machine lan- 
guage to be easily interpretable on common ma- 
chines. This technique also facilitates portable 
front ends for compilers: as the machine language 
is abstract, it can be easily interpreted on differ- 
ent (concrete) machines/platforms. 
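
As a rough illustration of the compile-then-interpret scheme described above, the following Python sketch compiles a toy expression language into instructions for a small stack machine and then interprets them. It is not the instruction set of the system presented in this paper; the opcodes PUSH, ADD and MUL and the functions compile_expr and run are assumptions made for this example only.

```python
from typing import List, Tuple, Union

# Toy source language: an int, or a tuple (op, left, right) with op in {"+", "*"}.
Expr = Union[int, Tuple[str, "Expr", "Expr"]]
Instr = Tuple[str, int]  # (opcode, operand); the operand is unused for ADD/MUL


def compile_expr(e: Expr) -> List[Instr]:
    """Translate an expression into postfix code for the abstract stack machine."""
    if isinstance(e, int):
        return [("PUSH", e)]
    op, left, right = e
    opcode = {"+": "ADD", "*": "MUL"}[op]
    return compile_expr(left) + compile_expr(right) + [(opcode, 0)]


def run(code: List[Instr]) -> int:
    """Interpret abstract machine code; the same loop runs any compiled program."""
    stack: List[int] = []
    for opcode, operand in code:
        if opcode == "PUSH":
            stack.append(operand)
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if opcode == "ADD" else a * b)
    return stack.pop()


program = compile_expr(("+", 2, ("*", 3, 4)))
result = run(program)
```

The compiler knows the source language and the interpreter knows only the instructions, which is the separation that makes the interpreter easy to port.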

Abstract machines were used for various procedural
and functional languages, but they became
prominent for logic programming languages
with the introduction of the Warren Abstract
Machine (WAM) (Warren 83; Aït-Kaci 91)
for Prolog. While Prolog has gained recognition
as a practical implementation of the idea of
programming in logic, a method for interpreting
the declarative logical statements was needed for
such an implementation to be well-founded. Even
though there were prior attempts to construct
both interpreters and compilers for Prolog, it was
the WAM that gave the language not only a good,
efficient compiler, but, perhaps more importantly,
an elegant operational semantics.

The WAM immediately became the starting 
point for many compiler designs for Prolog. The 
techniques it delineates serve not only for Pro- 
log proper, but also for constructing compilers 
for related languages: parallel Prolog compilers, 
variants of Prolog that use different resolution 
methods, extend Prolog with types or with record 
structures, etc. An additional advantage of ab- 
stract machines is that they are a useful tool in 
formally verifying the correctness of compilers. 

3 Inverting grammars for generation 

One of the attractions of declarative linguistic 
theories such as HPSG is that a single grammar, 
formulated in the theory, can be used both for 
parsing and for generation. While this is true in 
theory, not many practical implementations of lin- 
guistic formalisms support bidirectional grammar 
processing. Many advantages of bidirectional nat- 
ural language systems are listed in (Strzalkowski 
94), where three options for reversibility are con- 
sidered (pp. xiii-xxi): (1) A grammar is compiled 
into two separate programs, parser and generator, 
requiring a different evaluation strategy; (2) The 
parser and the generator are separate programs, 
executed using the same evaluation strategy; (3) 
The parser and the generator are one program, 
and the evaluation strategy can handle it being 
run in either direction. Our solution falls into the 
second category: there is only one input grammar, 
which is compiled into two different (abstract ma- 
chine) object programs; these two programs are 
executed using exactly the same mechanism, the 
interpreter, and hence employ the same strategy. 
This guarantees both ease of grammar develop-
ment and maintenance and no loss of efficiency.

Grammars are usually oriented towards the 
analysis of a string and not towards generation 
from a (usually nested) semantic form. In other 
words, rules reflect the phrase structure and not 
the predicate-argument structure. It is therefore 
desirable to transform the grammar in order to 
enable systematic reflection of any given logical 
form in the productions. To this end we apply
an inversion procedure, based upon (Samuelsson 95),²
to render the rules with the nested predicate-argument
structure, corresponding to that of input
logical forms. Once the grammar is inverted,
the generation process can be directed by the input
semantic form; elements of the input are consumed
during generation just like words are consumed
during parsing.

² Samuelsson's inversion algorithm was developed for
definite clause grammars (Pereira & Warren 80). We
ported it to a typed feature-structure framework.

Figure 1 depicts a simple example grammar in
ALE format³ (prd stands for predicate, a for argument,
var for variable, rst for restriction and
conn for connective). The first rule creates a sentence
(S) out of a noun phrase (NP) and a verb
phrase (VP). The semantics of the S (denoted by
the variable R6) is obtained by applying the semantics
of the NP (λR5.R6) to that of the VP. In
the same way, the second rule, combining a determiner
(DET) with a noun (N) to obtain an NP,
applies the meaning of the DET to that of the N
to obtain (after two β-reductions that are incorporated
into the rule) the meaning of the NP. The
lexical entries of three words are shown as well.

Figure 2 depicts (part of) the same grammar 
after inversion. The inverted grammar reflects 
the semantic argument structure, not the phrase 
structure. For example, the first rule creates 
a sentence, whose sem feature corresponds to 
∀R5.(R8(R5) → R10(R5)), from three components:
an N (R8), a VP (R10) and a semantic
head, R3. The string generated by the S, encoded
as the value of the str feature (see below),
is the concatenation of the strings generated by 
the head, the N and the VP. For such rules to 
be applicable, the lexicon has to be inverted, too: 
the "words" of the inverted grammar are atomic 
semantic formulae. The last three rules add syn- 
tactic information to the semantics encoded in the 
primitives. In addition to these inverted rules, a 
semantic knowledge base is generated, associat- 
ing semantic primitives with words. It is used in 
the final stage of the generation, when the actual 
words are generated. 

Grammars must satisfy certain requirements in
order for them to be invertible. However, the requirements
are not overly restrictive and allow encoding
of a variety of natural language grammars.
In particular, the semantics must be encoded by
predicate-argument structures. What the inversion
in fact achieves is restructuring of a grammar;
this enables effective treatment of the nested
structure of logical forms, so that the resulting
grammar is inherently suitable for generation.

³ The signature is omitted for lack of space.

Grammar inversion is performed as part of the 
compilation. The given grammar is enhanced in 
a way that ultimately enables the reconstruction of
the words spanned by the semantic forms. To
achieve this aim, each rule constituent is extended
by an additional special-purpose feature (str in
the example grammar). The value of this feature
for the rule's head is set to the concatenation of 
its values in the body constituents, to reflect the 
original phrase structure of the rule. 

Among the other advantages of the abstract 
machine approach mentioned above, this tech- 
nique gives an express solution for the termina- 
tion problem. It is usually difficult to define when 
generation terminates, but once the query is given 
as a sequence of semantic components, they are 
consumed in a linear manner. While generation, 
just like parsing, is not guaranteed to terminate, 
the termination criteria of parsing apply for our 
generation scheme. In other words, generation in 
our system can be viewed as parsing ('consum- 
ing') input sequences of meaning components. 

4 Unified parsing and generation 

AMALIA employs a bottom-up, chart-based control
unit, where rules are evaluated from left to
right. The chart is used for storing active and 
complete edges. The latter are represented as 
pointers to feature structures; the former con- 
sist of a sequence of such pointers (for the part 
of the edge prior to the dot) and a pointer to 
the compiled code (for the part succeeding the 
dot). For parsing, edges span a sub-sequence of 
the input string, assigning it some structure. For 
generation, edges span a sub-form of the input 
semantic form, also assigning it a structure that 
eventually determines a phrase whose meaning is 
that sub-form. It must be noted that at run-time 
there is no notion of the particular task (pars- 
ing/generation) performed by the machine, and 
the effect of the machine instructions is the same 
for both tasks. 

AMALIA's operation for generation differs from
parsing only in initialization and interpretation 
of the results. For parsing, the input is a string 
of words. Each word is looked up in the lexi- 
con, and its associated feature structure (or fea- 
ture structures, in case the word is ambiguous) 
is entered in the main diagonal of the chart as 
a complete edge. Thus, for the example grammar
and the input "every boy sleeps", the items
in the [0, 1], [1, 2], [2, 3] entries of the chart are as
depicted in Figure 3.

(phrase, syn:(syn, cat:s), sem:(R6, sem))
===>
cat> (phrase, syn:(syn, cat:np), sem:(lambda, (var:R5, rst:(R6, funct)))), % head
cat> (phrase, syn:(syn, cat:vp), sem:(lambda, (var:R7, rst:(R5, funct)))).

(phrase, syn:(syn, cat:np), sem:(R6, sem))
===>
cat> (phrase, syn:(syn, cat:det), sem:(lambda, (var:R5, rst:(R6, funct)))), % head
cat> (phrase, syn:(syn, cat:n), sem:(lambda, (var:R7, rst:(R5, funct)))).

every --->
(word, syn:(syn, cat:det),
       sem:(lambda, var:R5,
            rst:(lambda, var:R6,
                 rst:(prd:(forall, var:R2, form:(bool, conn:if,
                                                 wff1:(R5, a1:R2),
                                                 wff2:(R6, a1:R2))),
                      a1:R5, a2:R6)))).

boy --->
(word, syn:(syn, cat:n), sem:(lambda, var:R5, rst:(prd:boy, a1:R5))).

sleeps --->
(word, syn:(syn, cat:vp), sem:(lambda, var:R5, rst:(prd:sleep, a1:R5))).

Figure 1: A simple grammar

(phrase, syn:(syn, cat:s), str:[R3,R19,R22],
         sem:(R3, prd:(forall, var:R5, form:(conn:if,
                                             wff1:(R8, a1:R5),
                                             wff2:(R10, a1:R5))),
              a1:R8, a2:R10))
===>
(phrase, syn:cat:n, sem:(lambda, rst:R8), str:R19),
(phrase, syn:cat:vp, sem:(lambda, rst:R10), str:R22),
(lambda, var:R8, rst:(lambda, var:R10, rst:R3)).

(word, syn:cat:n, sem:R3, str:[R5])
===>
(R3, lambda, var:R4, rst:(R5, prd:noun, a1:R4)).

(word, syn:cat:vp, sem:R3, str:[R5])
===>
(R3, lambda, var:R4, rst:(R5, prd:v_intrans, a1:R4)).

(word, syn:cat:det, sem:R3, str:[R10])
===>
(R3, lambda, var:(R4, a1:R6),
     rst:(lambda, var:(R8, a1:R6),
          rst:(R10, prd:(forall, var:R6, form:(conn:if, wff1:R4, wff2:R8)),
               a1:R4, a2:R8))).

Figure 2: The inverted grammar (partial)

For generation, the input is a semantic form, 
represented as (an ALE description of) a feature 
structure. The chart is initialized with (com- 
plete) edges that correspond to elements in the 
input semantic form, rather than to words. For 
example, if the input is (a feature structure encoding
of) ∀x(boy(x) → sleep(x)), the items in
the [0, 1], [1, 2], [2, 3] entries of the chart are as
depicted in Figure 4. The first item encodes
λx.boy(x); the second, λx.sleep(x); and the third,
λP.λQ.∀x(P(x) → Q(x)).

Figure 3: Initial chart entries, parsing

Note that there need not be a one-to-one
correspondence between the initial states
of the chart in the two tasks. The semantic input is
scanned and its elements are (recursively) selected
in a pre-defined order that is induced by the restructuring
of the grammar rules (in particular,
arguments precede the predicate).
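
The ordering constraint mentioned above (arguments precede the predicate) can be pictured as a post-order walk over a nested logical form. The sketch below is only an illustration: the tuple encoding of forms, the name forall_if for the quantifier-plus-implication primitive, and the function linearize are assumptions of this example, and the actual selection order is the one induced by the inverted rules.

```python
from typing import List, Tuple

# A logical form is (predicate_name, [argument forms]); atoms have no arguments.
Form = Tuple[str, list]


def linearize(form: Form) -> List[str]:
    """Post-order walk: every argument is emitted before its predicate."""
    name, args = form
    out: List[str] = []
    for arg in args:
        out.extend(linearize(arg))
    out.append(name)
    return out


# A stand-in for "forall x (boy(x) -> sleep(x))": the quantifier takes two
# predicate arguments, which must be placed in the chart before it.
query: Form = ("forall_if", [("boy", []), ("sleep", [])])
items = linearize(query)  # ['boy', 'sleep', 'forall_if']
```

The resulting order matches the chart initialization of the running example, where the entries for boy and sleep precede the entry for the quantifier.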

Once the chart is initialized, the same process- 
ing strategy is applied independently of the task: 
the compiled program is executed on the input. 
The basic operation performed by the object pro- 
grams is unification, which is needed for both 
tasks. Unification implements the dot movement 
operation that lies at the heart of chart-based
parsing and generation. However, dot movement
is interpreted differently for the two tasks, since the
(compiled) grammar rules are different: for pars- 
ing, dot movement goes over a sub-part of the 
input phrase; for generation, it covers a part of 
the input logical form. 
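
To make the role of unification in dot movement concrete, here is a deliberately simplified Python sketch. Feature structures are modeled as nested dictionaries with atomic string values; types, reentrancy and compiled code are all left out, so this is not how the abstract machine represents or unifies structures. The names unify and move_dot and the sample entries are assumptions of this example.

```python
from typing import Optional, Union

FS = Union[str, dict]  # an atom, or a mapping from feature names to values


def unify(a: FS, b: FS) -> Optional[FS]:
    """Return the most general structure compatible with both, or None on a clash."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feat, b_val in b.items():
            if feat in result:
                sub = unify(result[feat], b_val)
                if sub is None:
                    return None  # clash somewhere below this feature
                result[feat] = sub
            else:
                result[feat] = b_val
        return result
    return a if a == b else None  # atoms unify only with identical atoms


def move_dot(found: list, remaining: list, edge: FS):
    """Advance an active edge over a complete edge if the next body element unifies with it."""
    merged = unify(remaining[0], edge)
    if merged is None:
        return None
    return found + [merged], remaining[1:]


rule_body = [{"syn": {"cat": "det"}}, {"syn": {"cat": "n"}}]
every = {"syn": {"cat": "det"}, "sem": "every"}
boy = {"syn": {"cat": "n"}, "sem": "boy"}
step1 = move_dot([], rule_body, every)
step2 = move_dot(*step1, boy)
```

After the second step nothing remains after the dot, which corresponds to a complete edge. The same move_dot would work unchanged if the chart entries were semantic primitives rather than words.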

Figure 4: Initial chart entries, generation

Consider the effect of dot movement for parsing:
assume that an active edge corresponding to
the second rule with the dot in the initial position
is applied to the lexical entry of "every", present
in [0,1]. The compiled code of the second rule
is executed on "every"; some trivial unifications
take place, but the more interesting ones bind R5
of the rule to the tagged value of sem:var in the lexical
entry, and R6 to the value of the path sem:rst.
A new active edge is created, with these bindings
recorded, and entered in [0,1]. The part of the
edge following the dot points to the second category
in the body of this rule. Assume further that
this edge is combined with (the complete edge
that is) the lexical entry of "boy". Several trivial
unifications take place, but the interesting ones
bind R7 in the rule to the tagged value of sem:var in
"boy", and R5 of the rule to the value of sem:rst in
"boy". Due to reentrancies among the rule's constituents,
the obtained (complete) edge (spanning [0,2]), whose
sem feature indeed encodes the semantics of "every
boy" (λQ.∀x(boy(x) → Q(x))), is as depicted
in Figure 5.

Figure 5: Parsing (intermediate) result

Next, we give a scenario of a generation process.
It is easy to see how the last three rules of the inverted
grammar are applicable to the three lexical
entries of Figure 4, respectively. Assume an active
edge corresponding to the first rule is present
in [0, 0], with the dot in the initial position. Two
dot movements, over the first two elements in the
body of this rule, bind R8 to the value of rst in
the lexical entry of λx.boy(x), and R10 to the
value of rst in λx.sleep(x). An active edge, with
the dot in the penultimate position, is obtained in
[0,2]. The next dot movement applies (the code
that was generated for) the last body element of
the rule to the lexical entry residing in [2,3]. R8
of the rule is unified with the tagged value of var
in this entry; since R8 was bound by previous unifications,
the value of prd is set to boy. R10 of the
rule is unified with the tagged value of rst:var, and the
second predicate is set to sleep. Finally, R3 is unified
with the value of rst:rst in the lexical entry; the
complete edge created, spanning the entire input,
is depicted in Figure 6.

Figure 6: Generation result

The chart algorithm ends up with a (possibly
empty) set of feature structures, spanning the entire
input: these are all the complete edges derivable
from the input and the grammar rules (there
is no notion of an initial symbol). Of course, if the
grammar is such that an infinite number of derivations
can be produced, computations might not
terminate (AMALIA does not incorporate a subsumption
check to test for spurious ambiguity).
For parsing, the results depict different structures
of the input string. Ideally, they contain some
representation of the string's semantics. This is
also true for generation, with a slight difference:
according to the grammar inversion algorithm,
each resultant structure is guaranteed to have a
feature (namely, str) that encodes the phrase
generated. As can be seen
in the example (Figure 6), the value of this feature
is not a list of words but rather a list of feature 
structures, each of which corresponds to (i.e., is 
subsumed by) a lexical entry in the inverted gram- 
mar. A final post-processing stage generates all 
the possible strings using this list and the seman- 
tic knowledge base. 
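
The final post-processing stage can be pictured as taking, for each element of the str list, every word that the semantic knowledge base associates with it, and enumerating all combinations. The sketch below makes strong simplifying assumptions: each element of str is reduced to the name of a semantic primitive (in the system it is a feature structure), the knowledge base is a plain dictionary, and the alternative word "each" is invented for the example. The function name realize is likewise hypothetical.

```python
from itertools import product
from typing import Dict, List


def realize(str_value: List[str], knowledge_base: Dict[str, List[str]]) -> List[str]:
    """Expand each semantic primitive into its candidate words and list all combinations."""
    choices = [knowledge_base[prim] for prim in str_value]
    return [" ".join(words) for words in product(*choices)]


# Hypothetical knowledge base: semantic primitive -> words that express it.
kb = {"forall": ["every", "each"], "boy": ["boy"], "sleep": ["sleeps"]}
phrases = realize(["forall", "boy", "sleep"], kb)
```

With one ambiguous primitive, two strings are produced; the number of outputs is the product of the numbers of candidate words per position.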

5 Implementation 

This section describes the input language for 
AMALIA grammars and touches on some implementation
details. In particular, it discusses the
differences between AMALIA and ALE in terms of
expressiveness and efficiency.

AMALIA supports the same type hierarchies as
ALE does, with exactly the same specification syn- 
tax. This means that the user can specify any 
bounded-complete partial order as the type hi- 
erarchy. Only immediate sub-types are specified, 
and the reflexive-transitive closure of the sub-type 
relation is computed automatically by the com- 
piler. The special type bot must be declared as 
the unique most general type. 
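
The closure computation mentioned above can be sketched as follows. This is an illustrative Python version, not the compiler's code: the dictionary encoding of the immediate sub-type declarations, the function name subtype_closure and the small sample hierarchy are assumptions of this example, and the hierarchy is assumed to be acyclic, as a partial order must be.

```python
from typing import Dict, List, Set


def subtype_closure(immediate: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each type to the set of all its sub-types, itself included."""
    closure: Dict[str, Set[str]] = {}

    def visit(t: str) -> Set[str]:
        if t not in closure:
            closure[t] = {t}  # reflexive part; also marks t as visited
            for sub in immediate.get(t, []):
                closure[t] |= visit(sub)  # transitive part
        return closure[t]

    for t in immediate:
        visit(t)
    return closure


# Hypothetical fragment of a hierarchy rooted at the most general type bot.
hierarchy = {"bot": ["sign", "sem"], "sign": ["word", "phrase"], "sem": ["lambda"]}
subs = subtype_closure(hierarchy)
```

Types that are declared only as someone's sub-type (such as word here) still receive an entry, containing just themselves.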

Appropriateness, too, is specified using ALE's
syntax, by listing features at the type they are introduced
by. The feature introduction condition
must be obeyed: every feature must be introduced
by some most general type, and is appropriate for
all its sub-types. However, AMALIA allows appropriateness
loops⁴ in the type hierarchy. Type
constraints are not supported by AMALIA.

⁴ Appropriateness loops are handled by employing lazy
evaluation techniques at run-time.

AMALIA uses a subset of ALE's syntax for describing
feature structures. As a rule, whenever
AMALIA supports ALE's functionality, it uses the
same syntax. In general, AMALIA supports totally
well-typed, possibly cyclic, non-disjunctive
feature structures. Set values, as in ALE, are
not supported, but list values are. AMALIA does
not respect the distinction between intensional
and extensional types (Carpenter 92b, Chapter
8). Also, feature structures cannot incorporate
inequality constraints.

The semantics of the logical descriptions, as 
well as the operator precedence, follow ALE. As
in ALE, partial descriptions are expanded at compilation
time. AMALIA's compiler performs type
inference on partial descriptions, reports any inconsistencies,
and then creates code for the expanded
structures. To avoid infinite processing in
the face of appropriateness loops (where no finite 
totally well-typed structure that satisfies the de- 
scription might exist), the compiler stops expand- 
ing a structure if it is the most general structure 
of its type. 

ALE includes a built-in definite logic programming
language; AMALIA does not. The entire
power of definite clause specifications is missing in
AMALIA. However, a few common functions that
are external to the feature structure formalism
were added to the system, and grammar specifications
can use them. These functions are referred
to as goals, although it must be remembered that
they are far weaker than ALE's goals.

AMALIA preserves ALE's syntax in describing
lexical entries. Multiple lexical entries may be
provided for each word, separated by semicolons.
It also keeps ALE's syntax in the definition of
empty categories (or ε-rules). In contrast to ALE,
AMALIA processes empty categories at compile
time. Each empty category is matched by the 
compiler against each element in the body of ev- 
ery rule; if the unification succeeds, a new rule is 
added to the grammar, based upon the original 
rule, with the matched element removed. Some 
limitations apply for this process (which in the 
general case is not guaranteed to terminate), and 
therefore the resulting grammar might not be 
equivalent to the original one. 
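
The compile-time treatment of empty categories can be illustrated with a much reduced Python sketch. Categories are atomic strings here and matching is plain equality, standing in for unification of feature structures. The single pass and the refusal to create rules with empty bodies are limitations chosen for this example, not a description of the limitations the compiler actually imposes. The names Rule and eliminate_empty are hypothetical.

```python
from typing import List, Set, Tuple

Rule = Tuple[str, Tuple[str, ...]]  # (mother category, body categories)


def eliminate_empty(rules: List[Rule], empties: Set[str]) -> List[Rule]:
    """One pass: for each body element matching an empty category,
    add a copy of the rule with that element removed."""
    result = list(rules)
    for mother, body in rules:
        for i, cat in enumerate(body):
            if cat in empties and len(body) > 1:
                new_rule = (mother, body[:i] + body[i + 1:])
                if new_rule not in result:
                    result.append(new_rule)
    return result


rules = [("s", ("np", "vp")), ("vp", ("v", "np"))]
expanded = eliminate_empty(rules, {"np"})
```

Because only the original rules are scanned, the newly added rules are not expanded again; a full treatment would have to iterate, which is where termination becomes an issue.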

AMALIA supports macros in a similar way to
ALE. The syntax is the same, and macros can have
parameters or call other macros (though not recursively,
of course). ALE's special macros for lists
are supported by AMALIA. Lexical rules are not
supported in this version of AMALIA. AMALIA's
syntax for phrase structure rules is similar to
ALE's, with the exception of the cats> specification
(permitting a list of categories in the body
of a rule), which is not supported.

The design details of the abstract machine are 
outside the scope of this paper; the reader is re- 
ferred to (Wintner & Francez 95; Wintner 97) 
for more information on the machine itself and 
to (Gabrilovich 97) for a detailed description of 
the grammar inversion. A practical description of 
AMALIA, its deviations from ALE and a complete
user's guide, are given in (Wintner et al. 97). 

AMALIA is implemented in C, augmented by
yacc, lex and Tcl/Tk (Ousterhout 94). It was
tested on various Unix platforms and on IBM
PCs. Two versions of AMALIA exist: an interactive,
easy-to-use, graphically interfaced system
and a text-oriented, non-interactive one. The for- 
mer is intended for developing prototype gram- 
mars; the latter is far more efficient but less user- 
friendly, and is intended to be used for batch 
processing. In addition, the system functions as 
a graphical development framework for grammar 
engineers by providing some tracing and debug- 
ging options. The user can direct the system to 
execute a program in its entirety, to break at a 
certain instruction or to proceed in steps, stop- 
ping after each executed instruction. Throughout 
the process of grammar execution, the abstract 
machine's internal state is displayed for the user 
to inspect. The main data structure upon which 
feature structures are being built, the heap, is dis- 
played, along with the machine's general-purpose 
and special-purpose registers. Moreover, the con- 
tents of the chart can be graphically displayed at 
any time and the derived structure can be recov- 
ered. Grammar development becomes an easier, 
simpler process. 

The system was tested with a wide variety of 
grammars, mostly adaptations of existing ALE 
grammars. While most of the example grammars 
are rather small, we believe that the system can 
handle real-scale grammars quite efficiently; however,
to accommodate large type hierarchies some
major space optimizations must be introduced.
It is important to emphasize that AMALIA does
not provide the wealth of input specifications ALE
does. On the other hand, development of grammars
in AMALIA is made easier due to the GUI
and the improved performance over ALE. The
support of generation is unique to our system.

To compare AMALIA with ALE we have used a
few benchmark grammars. The first is an early
version of an HPSG-based Hebrew grammar described
in (Wintner 97). It consists of 4 rules and
one empty category; the type hierarchy contains
84 types and 32 features, and the lexicon contains
13 words. The second is an HPSG-based grammar
for a subset (emphasizing relative clauses) of
the Russian language described in (Gabrilovich &
Estrin 96). It consists of 8 rules and 76 lexical entries;
the type hierarchy contains 151 types and 31
features. The third example is a simple grammar
generating the language {a^n b^n | n > 0}. Both
systems were used to compile the same grammars
and to parse the same strings. The results of a
performance comparison of AMALIA and ALE are
listed in Figure 7 (all times are in seconds; n indicates
the input string's length and r the number
of results). While the execution times for the
last grammar are less impressive, the differences
in compilation time indicate a major advantage
in using AMALIA for instructional purposes; in
such cases grammars are compiled over and over
again, while they are usually executed only a few
times. Limited experiments we have conducted
reveal that generation (as well as compilation for
generation) is 40%-100% slower than parsing (we
do not know of good benchmarks for generation).



Task                     ALE      AMALIA

Grammar 1
  Compilation            35.0     1.4
  Parsing, n=6, r=2      0.5      0.5
  Parsing, n=10, r=8     3.2      0.8
  Parsing, n=14, r=125   140.0    9.0

Grammar 2
  Compilation            68.0     2.3
  Parsing, n=2, r=2      0.5      0.8
  Parsing, n=4, r=2      2.4      0.9
  Parsing, n=7, r=2      5.1      1.1
  Parsing, n=8, r=2      7.8      1.2
  Parsing, n=12, r=2     17.0     1.5

Grammar 3
  Compilation            6.5      0.2
  Parsing, n=4           0.1      0.2
  Parsing, n=8           0.8      0.3
  Parsing, n=16          2.8      1.1
  Parsing, n=32          26.0     16.0

Figure 7: Performance comparison of ALE and AMALIA
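
For readers who prefer ratios to raw times, the compilation rows of Figure 7 can be turned into speedup factors with a few lines of Python. The numbers are copied from the figure; the variable names and the rounding to one decimal place are choices made for this illustration.

```python
# Compilation times in seconds, copied from Figure 7: (ALE, AMALIA).
compilation = {
    "Grammar 1": (35.0, 1.4),
    "Grammar 2": (68.0, 2.3),
    "Grammar 3": (6.5, 0.2),
}

# Speedup factor = ALE time divided by AMALIA time, rounded to one decimal.
speedup = {
    name: round(ale / amalia, 1)
    for name, (ale, amalia) in compilation.items()
}
```

By this measure compilation is roughly 25 to 33 times faster on the three benchmark grammars, which is the basis for the remark about instructional use.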